A measure of compression gain for new symbols in data-compression
Abstract
In coding theory it is widely known that the optimal encoded length of a message over a given alphabet is the Shannon entropy times the number of symbols to be encoded. However, depending on the structure of the message to be encoded, it is possible to beat this optimum by adding frequently occurring aggregates of symbols from the base alphabet as new symbols. We prove that the change in compressed message length caused by introducing a new aggregate symbol can be expressed as the difference of two entropies, which depend only on the probabilities of the characters within the aggregate, plus a correction term that involves only the probability and length of the introduced symbol. The expression is independent of the probabilities of all other symbols in the alphabet. This measure of information gain for a new symbol can be applied in data compression methods.
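The effect described in the abstract can be observed numerically. The following is a minimal sketch, not the paper's closed-form expression: it estimates the compressed length of a message as the empirical Shannon entropy times the number of symbols, then repeats the estimate after replacing a frequent pair with a single aggregate symbol. The function names `entropy_length` and `merge_pair`, the sample message, and the aggregate name `"AB"` are assumptions made for illustration, and the sketch deliberately ignores any cost of describing the new symbol to the decoder.

```python
import math
from collections import Counter


def entropy_length(symbols):
    """Estimated compressed length in bits: empirical entropy times symbol count."""
    n = len(symbols)
    counts = Counter(symbols)
    # Sum over symbols of count * (-log2 p), with p = count / n.
    return -sum(c * math.log2(c / n) for c in counts.values())


def merge_pair(symbols, pair, new_symbol):
    """Replace each non-overlapping occurrence of `pair` (left to right) with `new_symbol`."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


# Hypothetical message in which the pair ("a", "b") occurs frequently.
message = list("abababababcc")
merged = merge_pair(message, ("a", "b"), "AB")

before = entropy_length(message)  # base alphabet {a, b, c}
after = entropy_length(merged)    # alphabet {AB, c}
gain = before - after             # positive when the aggregate helps

print(f"before: {before:.2f} bits, after: {after:.2f} bits, gain: {gain:.2f} bits")
```

For this message the estimate drops once the aggregate is introduced, because the message becomes both shorter and more predictable. A practical compressor would also have to account for the cost of communicating the new symbol, which this sketch leaves out.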
Similar resources
A Novel Data Compression Technique for 4-20 mA Current Loop Transmitters
This paper presents a new data compression method for current loop transmitters. In this method, the 4-20 mA current domain is divided into equal pieces, each used for a distinct data domain with a constant relative resolution, which widens the signal span. This technique eliminates the need for high-resolution ADCs or DACs in the communication of 4-20 mA current loop signals. Furth...
Genetic Algorithms in Syllable-Based Text Compression
Syllable based text compression is a new approach to compression by symbols. In this concept syllables are used as the compression symbols instead of the more common characters or words. This new technique has proven itself worthy especially on short to middle-length text files. The effectiveness of the compression is greatly affected by the quality of dictionaries of syllables characteristic f...
Adaptive Heuristic Search for Soft-Input Soft-Output Decoding of Arithmetic Codes
Arithmetic coding for data compression has gained widespread acceptance for optimum compression when used in a suitable model. The "traditional" hard bit based decoding suffers from severe loss of data when errors are present in the received bitstream. In this paper, heuristic-search based algorithms are introduced for the decoding of arithmetic coded bitstreams. They utilize both channel "soft...
Correcting the stress-strain curve in hot compression test using finite element analysis and Taguchi method
In the hot compression test, friction has a detrimental influence on the flow stress throughout the process; correcting the deformation curve to reflect the real behavior is therefore very important for both researchers and engineers. In this study, a series of compression tests were simulated using Abaqus software. The Taguchi method was employed to design experiments by the factor...
Cumulative Fatigue Damage Under stepwise Tension-Compression Loading
Rock structures are subjected to cyclic tension-compression loading due to blasting, earthquakes, traffic, and injection-production in underground storage. A study of the fatigue behavior of rock samples under this type of loading is therefore required. In this study, the accumulated fatigue damage for a Green Onyx rock sample which consisted of only one mineral composition with two-step high-l...
Journal: CoRR
Volume: abs/1402.4738
Publication date: 2014